Traits database

Quick update and way forward

June 2025

Romain Frelat

Species list

Files in fellow-traits/data/raw-data/species-list



  • Merge files (in analyses/01_get_specieslist.R)
  • Harmonize the spellings with function clean_species_list():
    • remove punctuation: ? . , ()
    • remove sp, spp, cf, complex, …
    • remove author names and years when included
    • make sure only the first letter of the genus is uppercase
    • remove “Arbre”, “Espece”, “Inconue”, “Repousse”, …
    • remove taxa that are coarser than family: Dicotyledonae, Bryophyta
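
The harmonization rules above can be sketched in a few lines. This is a minimal Python illustration, not the project's clean_species_list() R function: the name clean_name, the word lists, and the "capitalised token means author" heuristic are all assumptions made for the example.

```python
import re

# Hypothetical word lists, for illustration only; the slides elide the full lists.
QUALIFIERS = {"sp", "spp", "cf", "complex"}
NON_TAXON_LABELS = {"arbre", "espece", "inconue", "repousse"}
TOO_COARSE = {"dicotyledonae", "bryophyta"}


def clean_name(raw):
    """Return a harmonized taxon name, or None if the entry should be dropped."""
    name = re.sub(r"\(.*?\)", " ", raw)      # drop parenthesised text such as "(L.)"
    name = re.sub(r"\b\d{4}\b", " ", name)    # drop four-digit years
    name = re.sub(r"[?.,()]", " ", name)      # drop the remaining punctuation
    if name.isupper():                        # all-caps entry: normalise the case first
        name = name.lower()
    tokens = [t for t in name.split() if t.lower() not in QUALIFIERS]
    if not tokens:
        return None
    genus = tokens[0]
    if genus.lower() in NON_TAXON_LABELS | TOO_COARSE:
        return None
    kept = [genus.capitalize()]               # only the first genus letter is uppercase
    for tok in tokens[1:]:
        # Crude assumption: epithets are lowercase, author names are capitalised.
        if tok[0].isupper():
            break
        kept.append(tok)
    return " ".join(kept)
```

For example, clean_name("Veronica persica Poir., 1808") gives "Veronica persica", and clean_name("Arbre") gives None. The author heuristic would fail on lowercase author particles (e.g. "de"), so a real implementation needs a more careful rule.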

Number of unique taxa: 1884


Taxonomic backbone

Taxa with no exact match in Taxref: 97
Taxa not found in Taxref with fuzzy matching: 43
Taxa not found in GBIF: 1 (Ornithogalum muscari)
Number of accepted taxa: 1706


    FAMILY      GENUS    SPECIES SUBSPECIES    VARIETY 
        12        197       1430         62          5 
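
The matching cascade (exact match first, fuzzy match for the remainder, a second backbone for what is still unmatched) can be sketched as follows. This is only an illustration using Python's standard difflib: the function name match_taxon, the 0.9 cutoff and the toy backbone are assumptions, and no Taxref or GBIF service is queried.

```python
from difflib import get_close_matches


def match_taxon(name, backbone, cutoff=0.9):
    """Return (accepted_name, how), with how in {"exact", "fuzzy", "unmatched"}."""
    if name in backbone:
        return name, "exact"
    # Fuzzy step: best candidate whose similarity ratio is at least `cutoff`.
    candidates = get_close_matches(name, backbone, n=1, cutoff=cutoff)
    if candidates:
        return candidates[0], "fuzzy"
    # Unmatched names would be passed on to the next backbone.
    return None, "unmatched"


# Toy backbone for the example; the real list would come from the reference taxonomy.
backbone = ["Poa annua", "Galium aparine", "Veronica persica"]
```

With this toy backbone, match_taxon("Galium aparin", backbone) returns ("Galium aparine", "fuzzy"). The cutoff is a trade-off: a lower value recovers more misspellings but risks matching a different species in the same genus, so fuzzy matches are worth reviewing by hand.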

Synonyms

43k known synonyms were retrieved from
the original species list, Taxref and GBIF.

specialist: a taxon listed in only one database
generalist: a taxon listed in 50% of the databases

sp_class
specialist      other generalist 
       773        789        144 
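
The classification above can be written as a small rule. This is a sketch under two assumptions that the slide does not state: "50%" is read as "at least 50% of the databases", and a taxon listed in exactly one database is always a specialist. The function name classify is hypothetical.

```python
def classify(n_listed, n_databases):
    """Classify a taxon from the number of databases in which it is listed."""
    if n_listed == 1:
        return "specialist"
    # Assumption: "50%" means at least half of the databases.
    if n_listed >= 0.5 * n_databases:
        return "generalist"
    return "other"
```

For example, with 10 databases, a taxon listed in 5 of them is a generalist and a taxon listed in 3 is "other".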

Trait databases

Trait coverage per database

Trait coverage per taxonomic rank




Taxa with no trait information.

Acacia
Agrimonia agrimonoides
Agropyron
Amaranthaceae
Apiaceae
Asparagaceae
Brassicaceae
Caryophyllaceae
Crambe abyssinica
Dysphania aristata
Glyceria
Lamiaceae
Liliaceae
Paronychia
Piptatherum
Poaceae
Roemeria hispida
Rosaceae
Rubiaceae
Viburnum

Trait coverage per species frequency

Trait completeness

Comparison - Plant height (m)

   spvignes FlorealData    Lososova        BIEN        GIFT    Ecoflora      filled
       1582        1345         129         610         492         851         102

Comparison - SLA

Hodgson    GIFT    BIEN  filled 
    862     731     720     475 

Comparison - Seed mass (mg)

Lososova     BIEN     GIFT Ecoflora Biolflor   filled 
     238      589      284     1072     1289      176 

Comparison - Life form

Way forward - open questions


Missing trait information?

  • which traits are missing? where to find them?
  • are any relevant trait databases missing?

How to get a clean and complete trait database?

  • do we need information for all species? can we discard the rare taxa / non-weed taxa?
  • how to combine trait information from multiple sources? biases?
  • can we measure missing traits? can we guesstimate some of them?
  • should we do trait imputation based on other traits or taxonomy?
  • most categorical traits should be cleaned (misspellings, irrelevant categories, …)
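
To make two of these options concrete, here is a minimal sketch of (a) combining sources by a fixed priority order and (b) imputing a numeric trait from congeneric species. Both are only one possible answer to the questions above, not a decision: the function names, the priority order and the use of the genus median are assumptions for illustration.

```python
from statistics import median


def combine_sources(values_by_source, priority):
    """Return (value, source) from the first source in `priority` that has a value."""
    for source in priority:
        value = values_by_source.get(source)
        if value is not None:
            return value, source
    return None, None


def impute_by_genus(traits):
    """Fill missing values with the median of congeneric species that have a value."""
    by_genus = {}
    for species, value in traits.items():
        if value is not None:
            by_genus.setdefault(species.split()[0], []).append(value)
    filled = {}
    for species, value in traits.items():
        if value is None:
            genus_values = by_genus.get(species.split()[0])
            # Stays None when no congeneric species has a value.
            value = median(genus_values) if genus_values else None
        filled[species] = value
    return filled
```

Keeping the source name alongside each combined value makes it possible to check later whether one database systematically differs from the others. Taxonomic imputation leaves a gap wherever a genus has no measured species, which is where imputation from other traits would have to take over.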

What’s next?

  • a first stepping stone for multiple sub-projects
  • in itself, is it a publishable output?
  • how to continue? who takes the lead from now?